Basic graphs. Geospatial Visualization

Lecture 3

732A98

Density plots and box plots

What should be analysed?

  • Density plot, histogram, violin plots

    • Mean value or typical value
    • Symmetry
    • Variation
    • Whether it resembles a known distribution
    • Heavy/light tailed
    • One or more modes
    • Skewness

Density plots and box plots

What should be analysed?

  • Box plot

    • Median
    • Variation
    • Outliers
    • Symmetry
    • Quantiles
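
The quantities read from a box plot can also be computed directly. A minimal sketch in base R, assuming the built-in mtcars data and the common 1.5 × IQR rule for flagging outliers (the rule is an assumption here, it is not stated on the slide):

```r
# Quantities behind a box plot, computed for mpg in the built-in mtcars data
mpg <- mtcars$mpg

q   <- quantile(mpg, probs = c(0.25, 0.5, 0.75))  # quartiles; q[2] is the median
iqr <- IQR(mpg)                                   # variation: interquartile range

# Assumed outlier rule: points more than 1.5 * IQR beyond the quartiles
lower_fence <- unname(q[1] - 1.5 * iqr)
upper_fence <- unname(q[3] + 1.5 * iqr)
outliers    <- mpg[mpg < lower_fence | mpg > upper_fence]

# Rough symmetry check: distance from the median to each quartile
symmetry <- c(lower = unname(q[2] - q[1]), upper = unname(q[3] - q[2]))

print(q)
print(iqr)
print(outliers)
print(symmetry)
```

Note that boxplot() itself works with hinges rather than quantile(), so the whiskers it draws may differ slightly from these fences.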

Density plots and box plots

Example: Visualizing miles per gallon depending on transmission type

[Figure: miles per gallon by transmission type, groups given by factor(am)]
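
Plots of this kind could be produced roughly as follows. This is a sketch, assuming ggplot2 is installed and that the example uses the built-in mtcars data (mpg and am); the exact settings of the original figure are not known.

```r
library(ggplot2)

# am is coded 0/1 in mtcars, so it is converted to a factor for grouping
p_density <- ggplot(mtcars, aes(x = mpg, fill = factor(am))) +
  geom_density(alpha = 0.5)          # overlapping densities, one per group

p_box <- ggplot(mtcars, aes(x = factor(am), y = mpg)) +
  geom_boxplot()                     # one box per transmission type

p_violin <- ggplot(mtcars, aes(x = factor(am), y = mpg)) +
  geom_violin()                      # density shape per transmission type

print(p_density)
print(p_box)
print(p_violin)
```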

Scatter plot

  • Y: dependent variable, X: independent variable
  • Adding a smoother is a good idea

Analysis:

  • Shape (data=true+error, true=linear, quadratic, cubic, exponential, .., empirical)
    • How to find the right model?
    • Fitting the data (regression)
    • Analysis of residuals or model selection methods
  • Strength (how close the observations are to a hypothesized model)
    • If linear: correlation r or coefficient of determination R²
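
The shape and strength checks above can be sketched in base R. The variables (wt against drat from mtcars) and the quadratic alternative are assumptions chosen for illustration:

```r
# Shape: fit a linear model and a quadratic alternative
fit_lin  <- lm(wt ~ drat, data = mtcars)
fit_quad <- lm(wt ~ poly(drat, 2), data = mtcars)

# Strength of the linear relationship
r  <- cor(mtcars$drat, mtcars$wt)     # correlation r
r2 <- summary(fit_lin)$r.squared      # coefficient of determination R^2

# Analysis of residuals: look for remaining structure around zero
plot(fitted(fit_lin), resid(fit_lin),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Model selection: compare the two candidate shapes by AIC
print(AIC(fit_lin, fit_quad))
print(c(r = r, r2 = r2))
```

For a simple linear regression with one predictor, R² equals r squared, so the two measures carry the same information.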

Scatter plot

Analysis:

  • Direction (if monotonic, decreasing or increasing; if not monotonic, which parts increasing, which decreasing)
  • Density (dense areas, sparse areas)
  • Outliers
  • Clusters

Scatter plot

Example: Visualizing weight and rear axle ratio

[Figure: scatter plot with axes drat and wt]
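
A scatter plot with a smoother for these two variables might look like this. It is a sketch assuming ggplot2 and the built-in mtcars data; the loess smoother is one possible choice, not necessarily the one used in the lecture.

```r
library(ggplot2)

# Scatter plot of weight against rear axle ratio with a loess smoother
p_scatter <- ggplot(mtcars, aes(x = drat, y = wt)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x)

print(p_scatter)
```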

Scatter plot

  • More variables can be mapped
  • Mark shape
  • Mark size
  • Mark color
  • Mark orientation
  • Juxtaposed displays or superimposed displays

  • If juxtaposed displays are used, we get a scatterplot matrix (SPLOM)

SPLOM

[Figure: scatterplot matrix of MPG, Displacement, Horse Power and weight]
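
A scatterplot matrix of this kind can be drawn in base R with pairs(). The sketch assumes the four variables correspond to mpg, disp, hp and wt in the built-in mtcars data:

```r
# Select the four variables and draw all pairwise scatter plots
splom_vars <- mtcars[, c("mpg", "disp", "hp", "wt")]

pairs(splom_vars,
      labels = c("MPG", "Displacement", "Horse Power", "Weight"))
```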

3D Surface plots and contour plots

  • Remember: interpolated data used
  • Analysis:
    • Peaks and troughs
    • Trends
    • Additivity
    • Always check the underlying data afterwards

3D Surface plots and contour plots

[Figure: 3D surface and contour plots of wt, mpg and disp]
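
Because surface and contour plots need values on a regular grid, the scattered observations have to be interpolated or smoothed first. A minimal sketch in base R, assuming mtcars and using a loess fit as the interpolator (the lecture's own interpolation method is not shown, so this is a substitute):

```r
# Smooth disp over the (wt, mpg) plane and evaluate the fit on a regular grid
fit <- loess(disp ~ wt + mpg, data = mtcars)

wt_seq  <- seq(min(mtcars$wt),  max(mtcars$wt),  length.out = 30)
mpg_seq <- seq(min(mtcars$mpg), max(mtcars$mpg), length.out = 30)
grid    <- expand.grid(wt = wt_seq, mpg = mpg_seq)

# expand.grid varies wt fastest, so rows of z index wt and columns index mpg
z <- matrix(as.numeric(predict(fit, newdata = grid)), nrow = length(wt_seq))

contour(wt_seq, mpg_seq, z, xlab = "wt", ylab = "mpg")      # contour plot
points(mtcars$wt, mtcars$mpg, pch = 16)                      # underlying data
persp(wt_seq, mpg_seq, z, xlab = "wt", ylab = "mpg",
      zlab = "disp", theta = 30, phi = 25)                   # 3D surface
```

Overlaying the original points on the contour plot makes it easier to see which parts of the surface are supported by data and which are only interpolated.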

Geospatial data

  • Geographical coordinates are involved
  • Used in many applications
    • Climate modeling/analysis
    • Economic/social data analysis
    • Transaction data

Spatial phenomena

  • Point phenomena (e.g. building location, city location)
  • Line phenomena (paths, roads)
  • Area phenomena (counties)
  • Surface phenomena (mountains)

Types of maps

  • Symbol/dot maps (nominal/ordinal point data)
  • Land use maps/Choropleth maps (nominal/ordinal area data)
  • Line diagrams (nominal/ordinal line data)
  • Isoline maps (ordinal surface data)
  • Surface maps (ordinal volume data)

  • Note: Different maps can be used for the same data
    • Choropleth map / Dot map
    • Density surface / Dot map

What is a map?

  • Map coordinates:
    • longitude λ ∈ [-180, 180], negative = west
    • latitude ϕ ∈ [-90, 90], negative = south
  • Challenge: [λ, ϕ] -> [x, y]

  • Different map projections
    • Conformal projection: retains angles (shapes) but not area
    • Equal area: retains areas but not angles (shapes)

What is a map?

  • Cylindrical projection, plane projection and cone projection
  • Cylindrical projection is used by Google and is now the standard

Cylindrical projection

  • Conformal (Mercator) projection: far northern/far southern areas inflated
  • Simplest cylindrical projection (equirectangular): x=λ, y=ϕ; Mercator uses x=λ, y=ln tan(π/4+ϕ/2)
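
To make the mapping [λ, ϕ] -> [x, y] concrete, here is a small base R sketch of two cylindrical projections on a unit sphere: the plain mapping x=λ, y=ϕ and the conformal Mercator mapping y = ln tan(π/4 + ϕ/2). The function names are hypothetical and chosen for illustration.

```r
# Convert degrees to radians
deg2rad <- function(deg) deg * pi / 180

# Plain cylindrical mapping: x = lambda, y = phi
equirectangular <- function(lon_deg, lat_deg) {
  data.frame(x = deg2rad(lon_deg), y = deg2rad(lat_deg))
}

# Mercator mapping: x = lambda, y = ln(tan(pi/4 + phi/2)); conformal
mercator <- function(lon_deg, lat_deg) {
  phi <- deg2rad(lat_deg)
  data.frame(x = deg2rad(lon_deg), y = log(tan(pi / 4 + phi / 2)))
}

lats <- c(0, 30, 60, 80)
comparison <- data.frame(lat   = lats,
                         y_eq  = equirectangular(0, lats)$y,
                         y_mer = mercator(0, lats)$y)
print(comparison)
```

The Mercator y value grows faster than the latitude itself as one moves away from the equator, which is the inflation of far northern and far southern areas mentioned above.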

Cone projection

  • Albers Equal-area projection
    • Preserves areas
    • Shapes and distances are not preserved

Visual variables for spatial data


Symbol/Dot maps

  • Data = Latitude, Longitude + other variables
  • Lat, Long -> coordinates; other variables -> visual aesthetics
    • The number of aesthetics is limited! (perception problems)
  • Another approach: multiple parameters on multiple maps
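
A symbol map along these lines could be sketched with plotly's plot_geo(). The data frame below is illustrative only: the coordinates are approximate and the value column is made up.

```r
library(plotly)

# Illustrative data: approximate city coordinates, 'value' is a made-up variable
cities <- data.frame(
  name  = c("Stockholm", "Gothenburg", "Uppsala"),
  lat   = c(59.33, 57.71, 59.86),
  lon   = c(18.07, 11.97, 17.64),
  value = c(30, 20, 10)
)

# Lat/Long give the position; the other variable is mapped to size and color
p_dots <- plot_geo(cities) %>%
  add_markers(x = ~lon, y = ~lat, size = ~value, color = ~value,
              text = ~name) %>%
  layout(geo = list(scope = "europe"))

p_dots
```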

Symbol/dot maps

  • Analysis:
    • Density within and between geographical areas
    • Spatial pattern of density (north, south)
    • Clusters, outliers
  • Problems:
    • Overplotting in highly populated areas
    • Several observations may have the same coordinates
    • Size aesthetic used -> perception problem
    • Perceived size depends on local neighborhood (Ebbinghaus illusion)
    • Color used -> color perception problems

Symbol/dot maps

  • Problems:
    • Absolute vs relative mapping (proportional to population)

Line diagrams

  • Observation: set of (Lat, Long) pairs + other variables
  • Often: start, end point
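
With start and end points per observation, a line diagram can be sketched with plot_geo() and add_segments() from plotly. The links below are illustrative (approximate city coordinates, made-up connections):

```r
library(plotly)

# Illustrative links: each row is one line with a start and an end point
links <- data.frame(
  start_lon = c(18.07, 18.07),
  start_lat = c(59.33, 59.33),
  end_lon   = c(11.97, 17.64),
  end_lat   = c(57.71, 59.86)
)

p_lines <- plot_geo() %>%
  add_segments(data = links,
               x = ~start_lon, xend = ~end_lon,
               y = ~start_lat, yend = ~end_lat,
               size = I(1.5), color = I("steelblue")) %>%
  layout(geo = list(scope = "europe"))

p_lines
```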

Line diagrams:

  • Same as in network analysis, plus
  • geographical relationships between links and their density (size)
    • Where are the dense links located?
    • How are the links directed?
  • Problems:
    • Overplotting
    • If line length is analysed -> length perception problem
    • If width is analysed -> volume perception problem
    • If colors are analysed -> color perception problem

Line diagrams

  • Overplotting - possible solution:
    • Use curved lines and minimize edge crossings

Visualizing area data

  • Data: Name/Coordinates of geographic area + other variables
  • Choropleth maps: variables are mapped to the color or shading of regions on the map
[Figure: choropleth map spanning roughly 10° W to 0° and 50° N to 62° N, colored by Price (0 to 100)]
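
A choropleth map can be sketched with ggplot2 and geom_sf(). Since the data behind the figure are not available here, the example assumes the North Carolina counties shapefile that ships with the sf package and colors counties by its BIR74 column:

```r
library(sf)
library(ggplot2)

# Example polygons shipped with sf (North Carolina counties)
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Each region is filled according to the value of one variable
p_choro <- ggplot(nc) +
  geom_sf(aes(fill = BIR74)) +
  scale_fill_viridis_c()

print(p_choro)
```

BIR74 is an absolute count, so large or populous counties dominate; dividing by a population-related column before mapping gives a relative view instead.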

Choropleth maps

  • Analysis:
    • Find clusters of regions that are similar
    • Find unusual regions (compared to neighbor regions)
    • Find patterns on the map
  • Problems affecting perception:
    • Color/grayscale mapping
    • Choice of regions (county, state,…)
    • A larger region with the same color looks dominant
    • Patterns in small/densely populated areas hard to see

Visualizing area data

  • Isarithmic maps: show the areas where a phenomenon occurs on the map (density)
    • Contour map
    • Topographic map

Software for geospatial visualization

  • Plenty of commercial/non-commercial software
    • ArcGIS, Google, Yahoo, Microsoft map API
  • Plotly
    • plot_geo()
    • Using MapBox + plot_mapbox()
  • To use Mapbox:
    • Register with your email, find your token
    • Run in R: Sys.setenv('MAPBOX_TOKEN' = 'your_mapbox_token_here')
  • ggplot2
    • geom_sf()
    • ggmap

Using maps

  • A few country maps are available through plotly
  • Downloading a map of a country:
    • Find the country map at http://gadm.org/
    • Decide what level of detail is needed (region, county,…)
    • Download the R (sf) file
    • Load the file into R using the readRDS function
      • e.g. rds <- readRDS('filename.rds')
    • Use with ggplot2: ggplot()+geom_sf(data=rds)
    • Use with Plotly: plot_ly() %>% add_sf(data=rds)
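
Put together, the workflow above could look like this. The sketch assumes the sf, ggplot2 and plotly packages are installed; 'filename.rds' is the placeholder name from the slide, so the code only draws the maps if such a file actually exists.

```r
library(sf)
library(ggplot2)
library(plotly)

map_file <- "filename.rds"   # placeholder: path to the downloaded R (sf) file

if (file.exists(map_file)) {
  rds <- readRDS(map_file)                    # sf object with region polygons

  p_gg <- ggplot() + geom_sf(data = rds)      # static map with ggplot2
  print(p_gg)

  p_ly <- plot_ly() %>% add_sf(data = rds)    # interactive map with plotly
  print(p_ly)
} else {
  message("Map file not found: ", map_file)
}
```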

Finding locations

Read at home

  • Chapter 6
  • Plotly book, ch 2.2, 2.4 and 2.5